Pretty PDF Challenge adjusted to match format of entire portfolio
Refer to File_Fragments folder for individual files used to knit this portfolio
Detail the code you used to create, initialize, and push your portfolio repo to GitHub. This will be helpful as you will need to repeat many of these steps to update your portfolio throughout the course.
git config --global user.name "Jonah Lin"
git config --global user.email "1jonahlin1@gmail.com"
… Set up MICB425_Materials folder in relevant place …
mkdir MICB425_Portfolio
cd MICB425_Portfolio
git init
git add .
git commit -m "State commit message here"
git remote add origin git@github.com:IStrykerI/MICB425_Portfolio.git
git remote -v
git push -u origin master
… Needed key to get to this repo since it’s locked. Regular submit codes below …
git add .
git commit -m "State commit message here"
git push
# Set-up
# install.packages("tidyverse")
library(tidyverse)
## -- Attaching packages --------------------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 2.2.1 v purrr 0.2.4
## v tibble 1.4.2 v dplyr 0.7.4
## v tidyr 0.8.0 v stringr 1.3.0
## v readr 1.1.1 v forcats 0.3.0
## -- Conflicts ------------------------------------------------------------------------------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
metadata <- read.table(file="Saanich.metadata.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN", "NA", "."))
OTUdata <- read.table(file="Saanich.OTU.txt", header=TRUE, row.names=1, sep="\t", na.strings=c("NAN", "NA", "."))
# source("https://bioconductor.org/biocLite.R")
# biocLite("phyloseq")
library(phyloseq)
load("phyloseq_object.RData")
physeq_percent = transform_sample_counts(physeq, function(x) 100 * x/sum(x))
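The `transform_sample_counts` call above rescales each sample's counts to percentages. As a minimal sketch of what the function argument does, here it is applied in base R to a made-up count vector. The name `to_percent` and the toy counts are illustrative assumptions, not part of the Saanich data.

```r
# Same anonymous function that is passed to transform_sample_counts(),
# given a name here purely for illustration
to_percent <- function(x) 100 * x / sum(x)

# Hypothetical OTU counts for a single sample (not real Saanich data)
toy_counts <- c(OTU1 = 50, OTU2 = 30, OTU3 = 20)

toy_percent <- to_percent(toy_counts)
print(toy_percent)       # each count as a percentage of the sample total
print(sum(toy_percent))  # a transformed sample sums to 100
```

Because the rescaling is applied sample by sample, every bar in a percent-abundance plot should reach the same total height.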
# Exercise 1
# Plot of NH4 with purple triangles
ggplot(metadata, aes(x=NH4_uM, y=Depth_m)) +
geom_point(color="Purple", shape=17)
# Exercise 2
# Convert Celsius to Fahrenheit and create dot plot of temperature in Fahrenheit against depth
Fahr_Data = metadata %>% mutate(Temperature_F = (Temperature_C*9/5) + 32) %>% select(Temperature_F, Depth_m)
ggplot(Fahr_Data, aes(x=Temperature_F, y=Depth_m)) + geom_point()
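A quick way to sanity-check the conversion used in `mutate()` is to run the same expression on temperatures with well-known Fahrenheit equivalents. This is only a sketch; the helper name `c_to_f` is hypothetical and not part of the exercise.

```r
# Celsius-to-Fahrenheit conversion, same expression as in the mutate() call
c_to_f <- function(temp_c) (temp_c * 9 / 5) + 32

# Reference points: freezing, boiling, and the -40 crossover
check_c <- c(0, 100, -40)
check_f <- c_to_f(check_c)
print(check_f)
```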
# Exercise 3
# Title addition with more descriptive x and y axis labels
plot_bar(physeq_percent, fill="Class") +
geom_bar(aes(fill=Class), stat="identity") + ggtitle("Classes from 10 to 200 m in Saanich Inlet") + xlab("Sample Depth") + ylab("Percent Relative Abundance") + theme(plot.title = element_text(size = 6))
# Exercise 4
# Select nutrient concentrations
Nutrient_Concentrations = metadata %>% select(Depth_m, O2_uM, PO4_uM, SiO2_uM, NO3_uM, NH4_uM, NO2_uM)
# Collapse all nutrient concentrations into depths
Nutrient_Depths = gather(Nutrient_Concentrations, "Nutrients", "uM", -1)
# Plot faceted figure of all nutrient concentrations
ggplot(Nutrient_Depths, aes(x=Depth_m, y=uM)) + geom_point() + geom_line() + facet_wrap(~Nutrients, scales="free_y")
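To see what the `gather` step does before faceting, the sketch below runs the same call on a tiny made-up table. The `toy_wide` values are hypothetical and chosen only to make the reshaping easy to follow; it assumes the tidyr package is installed.

```r
library(tidyr)

# Hypothetical wide table: one row per depth, one column per nutrient
toy_wide <- data.frame(
  Depth_m = c(10, 100),
  O2_uM   = c(200, 20),
  NO3_uM  = c(5, 25)
)

# Collapse every column except the first (Depth_m) into key/value pairs,
# mirroring gather(Nutrient_Concentrations, "Nutrients", "uM", -1)
toy_long <- gather(toy_wide, "Nutrients", "uM", -1)
print(toy_long)
```

Each depth now appears once per nutrient, which is the long layout `facet_wrap(~Nutrients)` needs to draw one panel per nutrient.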
Paste your code from the in-class activity of recreating the example PDF.
The following assignment is an exercise for the reproduction of this .html document using the RStudio and RMarkdown tools we’ve shown you in class. Hopefully by the end of this, you won’t feel at all the way this poor PhD student does. We’re here to help, and when it comes to R, the internet is a really valuable resource. This open-source program has all kinds of tutorials online.
http://phdcomics.com/ Comic posted 3-18-2018
The goal of this R Markdown html challenge is to give you an opportunity to play with a bunch of different RMarkdown formatting. Consider it a chance to flex your RMarkdown muscles. Your goal is to write your own RMarkdown that rebuilds this html document as close to the original as possible. So, yes, this means you get to copy my irreverent tone exactly in your own Markdowns. It’s a little window into my psyche. Enjoy =)
Hint: Go to the PhD Comics Website to see if you can find the image above.
If you can’t find that exact image, just find a comparable image from the PhD Comics website and include it in your markdown.
Let’s be honest, this header is a little arbitrary. But show me that you can reproduce headers with different levels please. This is a level 3 header, for your reference (You can most easily tell this from the table of contents).
Perhaps you’re already really confused by the whole markdown thing. Maybe you’re so confused that you’ve forgotten how to add. Never fear! A calculator R is here:
1231521+12341556280987
## [1] 1.234156e+13
Or maybe, after you’ve added those numbers, you feel like it’s about time for a table!
I’m going to leave all the guts of the coding here so you can see how libraries (R packages) are loaded into R (More on that later). It’s not terribly pretty, but it hints at how R works and how you will use it in the future. The summary function used below is a nice data exploration function that you may use in the future.
library(knitr)
kable(summary(cars),caption="I made this table with kable in the knitr package library")
| speed | dist |
|---|---|
| Min. : 4.0 | Min. : 2.00 |
| 1st Qu.:12.0 | 1st Qu.: 26.00 |
| Median :15.0 | Median : 36.00 |
| Mean :15.4 | Mean : 42.98 |
| 3rd Qu.:19.0 | 3rd Qu.: 56.00 |
| Max. :25.0 | Max. :120.00 |
And now you’ve almost finished your first RMarkdown! Feeling excited? We are! In fact, we’re so excited that maybe we need a big finale eh? Here’s ours! Include a fun GIF of your choice!
Describe the numerical abundance of microbial life in relation to ecology and biogeochemistry of Earth systems.
What were the primary methodological approaches used?
Primary Methodological Approaches: Compiling estimates from various published papers of the number of prokaryotes in various habitats, their total C content (mainly), turnover times, and cellular production rates
Comment on the emergence of microbial life and the evolution of Earth systems
Indicate the key events in the evolution of Earth systems at each approximate moment in the time series. If times need to be adjusted or added to the timeline to fully account for the development of Earth systems, please do so.
Describe the dominant physical and chemical characteristics of Earth systems at the following waypoints:
Describe the numerical abundance of microbial life in relation to the ecology and biogeochemistry of Earth systems.
What are the primary prokaryotic habitats on Earth and how do they vary with respect to their capacity to support life? Provide a breakdown of total cell abundance for each primary habitat from the tables provided in the text.
Primary Prokaryotic Habitats on Earth:
How do they vary with respect to their capacity to support life:
Aquatic environments have the highest rate of cellular productivity, while subsurface environments have the lowest rate of the 3 habitats (even though they have the largest population).
What is the estimated prokaryotic cell abundance in the upper 200 m of the ocean and what fraction of this biomass is represented by marine cyanobacterium including Prochlorococcus? What is the significance of this ratio with respect to carbon cycling in the ocean and the atmospheric composition of the Earth?
Estimated Prokaryotic Cell Abundance in Upper 200m of Ocean: $3.6 \times 10^{28}$
Fraction represented by marine cyanobacterium (+ Prochlorococcus): $(4 \times 10^{4}) / (5 \times 10^{5}) \times 100 = 8\%$
Significance of this ratio with respect to C cycling in ocean and atmospheric composition of Earth:
Approx. 8% of these prokaryotes (Cyanobacteria + Prochlorococcus) are photosynthetic, fixing CO2 into biomass and releasing O2 to the atmosphere
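The 8% figure can be reproduced with a couple of lines of R. The two inputs are the numbers used in the answer; treating them as cell densities in cells/mL is an assumption here, since the answer gives only the values, and the variable names are illustrative.

```r
# Values from the answer above; the cells/mL units are assumed
cyano_density <- 4e4  # marine cyanobacteria incl. Prochlorococcus
total_density <- 5e5  # all prokaryotes in the upper 200 m

percent_cyano <- cyano_density / total_density * 100
print(percent_cyano)  # percentage of cells that are cyanobacteria
```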
What is the difference between an autotroph, heterotroph, and a lithotroph based on information provided in the text?
Difference Between Autotroph/Heterotroph/Lithotroph: Based on information provided in text.
Based on information provided in the text and your knowledge of geography what is the deepest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this depth?
Deepest Habitat: 4 km (Terrestrial) and 10.9 - 14.9 km (Marine)
Primary Limiting Factor: Temperature (125 °C)
Based on information provided in the text and your knowledge of geography what is the highest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this height?
Highest Habitat: 77 km (In reality ~20 km above surface)
Primary Limiting Factor(s): Stable Space/Resources/Radiation/Lack of Moisture
Based on estimates of prokaryotic habitat limitation, what is the vertical distance of the Earth’s biosphere measured in km?
Vertical Distance of Earth’s Biosphere: ~24 - 44 km
How was annual cellular production of prokaryotes described in Table 7 column four determined? (Provide an example of the calculation)
Annual Cellular Production of Prokaryotes:
Population * (Turnovers/Yr) = Cells/Yr
$3.6 \times 10^{28}\ \text{cells} \times \dfrac{365\ \text{days/yr}}{16\ \text{days/turnover}} = 8.2 \times 10^{29}\ \text{cells/yr}$
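The same calculation as a short R sketch, using the population and the 16-day turnover time from the worked example. The variable names are illustrative.

```r
population    <- 3.6e28  # cells in the upper 200 m of the ocean
turnover_days <- 16      # days per turnover

# Number of turnovers in one year, then cells produced per year
turnovers_per_year <- 365 / turnover_days
cells_per_year     <- population * turnovers_per_year

print(turnovers_per_year)
print(signif(cells_per_year, 2))  # rounded to 2 significant figures
```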
What is the relationship between carbon content, carbon assimilation efficiency and turnover rates in the upper 200m of the ocean? Why does this vary with depth in the ocean and between terrestrial and marine habitats?
Relationship between C content, C assimilation efficiency, and turnover rates in the upper 200m of ocean:
Because turnover rates in the upper 200m of the ocean are high and the estimated C assimilation efficiency is low (0.2), the C content will be low: most of the C is used to support the turnover of prokaryotes rather than being assimilated.
This varies with depth in the ocean and between terrestrial and marine habitats because, as depth increases, the turnover rate decreases due to low metabolic activity. This in turn leads to higher C content, since the turnover of prokaryotes at greater depths becomes slow enough for C to be assimilated.
How were the frequency numbers for four simultaneous mutations in shared genes determined for marine heterotrophs and marine autotrophs given an average mutation rate of $4 \times 10^{-7}$ per DNA replication? (Provide an example of the calculation with units. Hint: cell and generation cancel out)
Frequency Number for 4 Simultaneous Mutations in Shared Genes:
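Average Mutation Rate = $4 \times 10^{-7}$ Per DNA Replication
Turnover Rate = $365 / 16 = 22.8$ Turnovers/Yr
$(4 \times 10^{-7})^{4} = 2.56 \times 10^{-26}$ Mutations/Generation
$3.6 \times 10^{28}$ Cells $\times\ 22.8$ Turnovers/Yr $= 8.2 \times 10^{29}$ Cells/Yr
$8.2 \times 10^{29}$ Cells/Yr $\times\ 2.56 \times 10^{-26}$ Mutations/Generation $= 2.1 \times 10^{4}$ Mutations/Yr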
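The chain of steps above can be checked in R. This is a sketch using the numbers from the worked answer; the variable names are illustrative.

```r
mutation_rate <- 4e-7    # per DNA replication (from the question)
n_mutations   <- 4       # simultaneous mutations required
population    <- 3.6e28  # cells in the upper 200 m of the ocean
turnover_days <- 16      # days per turnover

# Probability of all four mutations occurring in one generation
p_four <- mutation_rate^n_mutations

# Cell generations per year, then expected four-mutation events per year
cells_per_year  <- population * (365 / turnover_days)
events_per_year <- cells_per_year * p_four

print(p_four)
print(signif(events_per_year, 2))  # rounded to 2 significant figures
```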
Given the large population size and high mutation rate of prokaryotic cells, what are the implications with respect to genetic diversity and adaptive potential? Are point mutations the only way in which microbial genomes diversify and adapt?
Implications:
No: Point mutations are not the only way in which microbial genomes diversify and adapt. There can also be HGT between other bacteria, different levels of gene regulation/expression, insertions/deletions, etc.
What relationships can be inferred between prokaryotic abundance, diversity, and metabolic potential based on the information provided in the text?
Relationships Between Prokaryotic Abundance, Diversity, and Metabolic Potential: High Prokaryotic Abundance <–> Higher Diversity <–> Higher Metabolic Potentials (More prokaryotes will lead to higher diversity via mutations and mutations could contribute to better genes that help with metabolism)
Discuss the role of microbial diversity and formation of coupled metabolism in driving global biogeochemical cycles.
What are the primary geophysical and biogeochemical processes that create and sustain conditions for life on Earth? How do abiotic versus biotic processes vary with respect to matter and energy transformation and how are they interconnected?
Primary Geophysical Processes = Tectonics and atmospheric photochemical processes
Primary Biogeochemical Processes = Microbially catalyzed, thermodynamically constrained redox reactions
How abiotic vs. biotic processes vary with respect to matter/energy transformation and how they are interconnected: Abiotic processes usually supply biotic processes with substrates (biotic processes use up energy to sustain life and use up matter to produce “waste products”, whereas abiotic processes create energy via the transformation of “waste products” into substrates usable by microbes)
Why is Earth’s redox state considered an emergent property?
Earth’s Redox State = Emergent Property of Microbial Life on Planetary Scale
How do reversible electron transfer reactions give rise to element and nutrient cycles at different ecological scales? What strategies do microbes use to overcome thermodynamic barriers to reversible electron flow?
Reversible electron transfer reactions –> Element + Nutrient Cycles at different ecological scales? Steps:
Using information provided in the text, describe how the nitrogen cycle partitions between different redox “niches” and microbial groups. Is there a relationship between the nitrogen cycle and climate change?
N Cycle
What is the relationship between microbial diversity and metabolic diversity and how does this relate to the discovery of new protein families from microbial community genomes?
Relationship between microbial diversity and metabolic diversity: Higher microbial diversity leads to higher metabolic diversity, since different microbes may have a better chance of survival under different conditions by using a different resource for metabolism (high metabolic diversity)
Relation to discovery of new protein families from microbial community genomes: Microbial communities with high diversity (in terms of species and metabolic diversity) offer a higher chance of discovering new protein families, because mutations create more efficient or functionally different proteins in microbes.
On what basis do the authors consider microbes the guardians of metabolism?
Microbes = Guardians of Metabolism
Achenbach J. 2012. Spaceship Earth: A new view of environmentalism. The Washington Post.
Canfield DE, Glazer AN, Falkowski PG. 2010. The Evolution and Future of Earth’s Nitrogen Cycle. Science. 330:192-196.
Falkowski PG, et al. 2008. The Microbial Engines That Drive Earth’s Biogeochemical Cycles. Science. 320(5879):1034-1039.
Kasting JF, Siefert JL. 2002. Life and the Evolution of Earth’s Atmosphere. Science. 296:1066-1068.
Leopold A, Schwartz CW. 1949. A Sand County Almanac: With Other Essays on Conservation from Round River. Enl. ed.
Nisbet EG, Sleep NH. 2001. The habitat and nature of early life. Nature. 409(6823):1083-1091.
Rockström J, Steffen W, Noone K, Scheffer M, et al. 2009. A safe operating space for humanity. Nature. 461(7263):472-475.
Schrag DP. 2012. Geobiology of the Anthropocene. Chapter 22.
Suddick EC, Whitney P, Townsend AR, Davidson EA. 2013. The role of nitrogen in climate change and the impacts of nitrogen-climate interactions in the United States: foreword to thematic issue. Biogeochemistry. 114(3):1-10.
Whitman WB, Coleman DC, Wiebe WJ. 1998. Prokaryotes: The Unseen Majority. Proc Natl Acad Sci USA. 95(12):6578-6583. PMC33863
Zehnder AJB. 1988. Biology of anaerobic microorganisms. Research in Microbiology. Chapter 1.
Discuss the relationship between microbial community structure and metabolic diversity
Evaluate common methods for studying the diversity of microbial communities
Recognize basic design elements in metagenomic workflows
Specific emphasis should be placed on the process used to find the answer. Be as comprehensive as possible e.g. provide URLs for web sources, literature citations, etc.
How many prokaryotic divisions have been described and how many have no cultured representatives (microbial dark matter)?
How many metagenome sequencing projects are currently available in the public domain and what types of environments are they sourced from?
What types of on-line resources are available for warehousing and/or analyzing environmental sequence information (provide names, URLS and applications)?
Shotgun Metagenomics:
What is the difference between phylogenetic and functional gene anchors and how can they be used in metagenome analysis?
Phylogenetic:
What is metagenomic sequence binning? What types of algorithmic approaches are used to produce sequence bins? What are some risks and opportunities associated with using sequence bins for metabolic reconstruction of uncultivated microorganisms?
Types of Algorithms:
Is there an alternative to metagenomic shotgun sequencing that can be used to access the metabolic potential of uncultivated microorganisms? What are some risks and opportunities associated with this alternative?